Preamble

Why we are doing this

A script-based language

R is a script-based language. You write down a list of instructions and R follows them, performing one action after another. This is different to ‘point and click’ software like Microsoft Excel, and writing scripts can feel a bit cumbersome.

In Excel, you can perform a series of steps:

  • Open a file.
  • Delete a column that you don’t need.
  • Add a new column that contains a function.
  • ‘Fill’ that function down to the end of the dataset.
  • Select all of your data and sort from highest to lowest.
  • Delete all the rows that have missing values.
  • Save your file.
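In R, the same kind of workflow is written down as a script, one instruction per line. The sketch below is only an illustration: it assumes the dplyr package (part of the tidyverse) is installed, and the `people` table, its column names and the `bmi` formula are all made up for the example.

```r
library(dplyr)

# A small made-up table standing in for the file you would open.
people <- tibble(
  name   = c("Ana", "Ben", "Cal", "Dee"),
  notes  = c("a", "b", "c", "d"),      # a column we don't need
  height = c(1.60, 1.82, NA, 1.75),    # metres, with one missing value
  weight = c(55, 80, 70, 68)           # kilograms
)

# Delete a column that we don't need.
people <- select(people, -notes)

# Add a new column from a formula; it is applied to every row,
# so there is nothing to 'fill down'.
people <- mutate(people, bmi = weight / height^2)

# Sort from highest to lowest.
people <- arrange(people, desc(bmi))

# Delete the rows where the new column is missing.
people <- filter(people, !is.na(bmi))

# Save the result (to a temporary file in this sketch).
out_file <- tempfile(fileext = ".csv")
write.csv(people, out_file, row.names = FALSE)
```

Because every step is written down, the whole sequence can be re-run from top to bottom whenever the data changes.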

R is free, open-source and powerful. In the past five years, it has also become easier to use and get started with.

…

Getting started in R

R Projects

R Studio layout

(screen grabs)

Functions

Installing and loading packages

Installing a package is like installing an app on your phone. It….

You can install a package using the install.packages function. Note that there will be lots of text that appears o

Now we need to load it using the library function, which is like opening an app you have installed on your phone. We do this every time (every ‘session’) we want to use it.

Notice below that we received some messages and warnings when we loaded the

library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.2.5
## ✔ tibble  2.0.1     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.3.0
## Warning: package 'tibble' was built under R version 3.5.2
## Warning: package 'stringr' was built under R version 3.5.2
## ── Conflicts ─────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

Part 1: Reading and exploring with visuals

This

Read a CSV file into R

This uses the read_csv function and, here, we’re only going to give it one argument: the path to the CSV file you want to read, in quotation marks.

Tip: open quotation marks and hit tab to choose your file (and save you some typing).

## Parsed with column specification:
## cols(
##   country = col_character(),
##   continent = col_character(),
##   year = col_double(),
##   lifeExp = col_double(),
##   pop = col_double(),
##   gdpPercap = col_double()
## )
## # A tibble: 1,704 x 6
##    country     continent  year lifeExp      pop gdpPercap
##    <chr>       <chr>     <dbl>   <dbl>    <dbl>     <dbl>
##  1 Afghanistan Asia       1952    28.8  8425333      779.
##  2 Afghanistan Asia       1957    30.3  9240934      821.
##  3 Afghanistan Asia       1962    32.0 10267083      853.
##  4 Afghanistan Asia       1967    34.0 11537966      836.
##  5 Afghanistan Asia       1972    36.1 13079460      740.
##  6 Afghanistan Asia       1977    38.4 14880372      786.
##  7 Afghanistan Asia       1982    39.9 12881816      978.
##  8 Afghanistan Asia       1987    40.8 13867957      852.
##  9 Afghanistan Asia       1992    41.7 16317921      649.
## 10 Afghanistan Asia       1997    41.8 22227415      635.
## # … with 1,694 more rows

Looks good! But it isn’t in our Environment (on the right) yet because we didn’t assign it to anything.

## Parsed with column specification:
## cols(
##   country = col_character(),
##   continent = col_character(),
##   year = col_double(),
##   lifeExp = col_double(),
##   pop = col_double(),
##   gdpPercap = col_double()
## )

Now it is in our Global Environment over there --> wooh!
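The two steps above, reading without assigning and then reading with an assignment, can be sketched as follows. The path to the workshop's CSV file is not shown here, so this example first writes a three-row stand-in file to a temporary location, using rows from the printed output above with their values as displayed (rounded). The object name `gapminder` is also an assumption. In practice you would put the path to your own file inside the quotation marks.

```r
library(readr)

# Stand-in for the real file: three rows copied from the printed output
# above, with values as displayed (rounded).
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "country,continent,year,lifeExp,pop,gdpPercap",
    "Afghanistan,Asia,1952,28.8,8425333,779",
    "Afghanistan,Asia,1957,30.3,9240934,821",
    "Afghanistan,Asia,1962,32.0,10267083,853"
  ),
  csv_path
)

# Reading without assigning prints the data but does not keep it.
read_csv(csv_path)

# Assigning with <- stores the data as an object called gapminder,
# which then appears in the Environment pane.
gapminder <- read_csv(csv_path)
```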

Peeking at the data

Much like Excel, we can explore the gapminder dataset with our eyes.

View will open up a new tab that displays your dataset. You can scroll through it.

head will print just the first few observations. This is handy to check on things as you’re going along.

## # A tibble: 6 x 6
##   country     continent  year lifeExp      pop gdpPercap
##   <chr>       <chr>     <dbl>   <dbl>    <dbl>     <dbl>
## 1 Afghanistan Asia       1952    28.8  8425333      779.
## 2 Afghanistan Asia       1957    30.3  9240934      821.
## 3 Afghanistan Asia       1962    32.0 10267083      853.
## 4 Afghanistan Asia       1967    34.0 11537966      836.
## 5 Afghanistan Asia       1972    36.1 13079460      740.
## 6 Afghanistan Asia       1977    38.4 14880372      786.

names will display the names of all variables in the dataset (and is often the answer to ‘what was that variable called again…’).

## [1] "country"   "continent" "year"      "lifeExp"   "pop"       "gdpPercap"
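The three functions above can be tried on any data frame. A minimal sketch, using a three-row stand-in for the gapminder data (values as displayed in the printed output, so rounded) rather than the full file:

```r
# A three-row stand-in for the gapminder data (values as printed above).
gapminder <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Afghanistan"),
  continent = c("Asia", "Asia", "Asia"),
  year      = c(1952, 1957, 1962),
  lifeExp   = c(28.8, 30.3, 32.0),
  pop       = c(8425333, 9240934, 10267083),
  gdpPercap = c(779, 821, 853)
)

# View() opens a spreadsheet-style tab, so only run it in an
# interactive session such as RStudio.
if (interactive()) View(gapminder)

head(gapminder)      # the first six rows by default (all three here)
head(gapminder, 2)   # or ask for a specific number of rows
names(gapminder)     # the names of all variables
```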

Exploring further with plotly

Look at the gapminder dataset.

Now close your eyes and picture the gapminder dataset:

  • Add a new column to the right with the name ‘my_column’.
  • Only keep rows from 2007.
  • Then remove the ‘year’ column.
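Each of those three imagined changes has a matching function in the dplyr package (loaded as part of the tidyverse). A minimal sketch, one step at a time: the `toy` table and its values are made up for illustration, and what goes into `my_column` is an assumption, since the text only gives the column a name.

```r
library(dplyr)

# A made-up table with the same kind of columns as gapminder.
toy <- tibble(
  country = c("A", "A", "B", "B"),
  year    = c(2002, 2007, 2002, 2007),
  lifeExp = c(60, 62, 70, 71)
)

# Add a new column to the right
# (here: life expectancy in months, purely as an example).
with_column <- mutate(toy, my_column = lifeExp * 12)

# Only keep rows from 2007.
only_2007 <- filter(with_column, year == 2007)

# Then remove the 'year' column.
no_year <- select(only_2007, -year)

no_year
```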

Putting it all together with pipes %>%
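The pipe, %>%, passes the result on its left to the function on its right as that function's first argument, so a chain of steps can be read from top to bottom as ‘take this, then do that’. A minimal sketch, assuming dplyr is installed; the `toy` table and the contents of `my_column` are made up for illustration.

```r
library(dplyr)

# A made-up table with the same kind of columns as gapminder.
toy <- tibble(
  country = c("A", "A", "B", "B"),
  year    = c(2002, 2007, 2002, 2007),
  lifeExp = c(60, 62, 70, 71)
)

# Take the table, then add a column, then keep 2007, then drop 'year'.
result <- toy %>%
  mutate(my_column = lifeExp * 12) %>%
  filter(year == 2007) %>%
  select(-year)

result
```

Written this way there is no need to create and name an intermediate object after every step.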

Changing the look of our plots

We want to make our plots as clear as possible…

## 
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
## 
##     discard
## The following object is masked from 'package:readr':
## 
##     col_factor

## geom_path: Each group consists of only one observation. Do you need to
## adjust the group aesthetic?
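The geom_path message above means that ggplot2 was asked to draw a line through groups that each contain a single point. The plotting code is not shown here, so the following is only a sketch of one common cause: when the x variable is text (or a factor), every distinct value becomes its own group. The `yearly` table is made up for illustration.

```r
library(ggplot2)

# Made-up data where the year is stored as text rather than as a number.
yearly <- data.frame(
  year    = c("1952", "1957", "1962"),
  lifeExp = c(28.8, 30.3, 32.0)
)

# Each year is treated as its own group of one observation, so printing
# this plot gives the message and no line is drawn.
line_with_message <- ggplot(yearly, aes(x = year, y = lifeExp)) +
  geom_line()

# Setting group = 1 tells ggplot2 that all points belong to one line.
line_fixed <- ggplot(yearly, aes(x = year, y = lifeExp, group = 1)) +
  geom_line()
```

Another option in a case like this is to convert the column to a number (for example with as.numeric) before plotting.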


Part 2:

bin

Maths

At some point throughout your university life you will need to write equations in a document.



$A = (r^{4}) / $